Semantic heterogeneity : English Wikipedia
Semantic heterogeneity

Semantic heterogeneity arises when database schemas or datasets for the same domain are developed by independent parties, resulting in differences in the meaning and interpretation of data values. Beyond structured data, the problem is compounded by the flexibility of semi-structured data and by the varied tagging methods applied to documents or unstructured data. Semantic heterogeneity is one of the more important sources of differences in heterogeneous datasets.
Yet, for multiple data sources to interoperate with one another, it is essential to reconcile these semantic differences. Decomposing the various sources of semantic heterogeneities provides a basis for understanding how to map and transform data to overcome these differences.
== Classification of semantic heterogeneities ==

One of the first known classification schemes applied to data semantics is from William Kent more than two decades ago. Kent's approach dealt more with structural mapping issues than with differences in meaning; for the latter, he pointed to data dictionaries as a potential solution.
One of the most comprehensive classifications is from Pluempitiwiriyawej and Hammer, "Classification Scheme for Semantic and Schematic Heterogeneities in XML Data Sources". They classify heterogeneities into three broad classes:
* ''Structural'' conflicts arise when the schemas of sources representing related or overlapping data exhibit discrepancies. Structural conflicts can be detected by comparing the underlying schemas. The class of structural conflicts includes generalization conflicts, aggregation conflicts, internal path discrepancy, missing items, element ordering, constraint and type mismatch, and naming conflicts between the element types and attribute names.
* ''Domain'' conflicts arise when the semantics of the data sources that will be integrated exhibit discrepancies. Domain conflicts can be detected by looking at the information contained in the schema and using knowledge about the underlying data domains. The class of domain conflicts includes schematic discrepancy, scale or unit, precision, and data representation conflicts.
* ''Data'' conflicts refer to discrepancies among similar or related data values across multiple sources. Data conflicts can only be detected by comparing the underlying sources. The class of data conflicts includes ID-value, missing data, incorrect spelling, and naming conflicts between the element contents and the attribute values.
Moreover, mismatches or conflicts can occur between set elements (a "population" mismatch) or attributes (a "description" mismatch).
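As a rough illustration of the classes above, the following Python sketch reconciles two small sources. All record contents, field names, and helper functions here are hypothetical and chosen only for illustration; they do not come from Pluempitiwiriyawej and Hammer. The sketch handles one naming conflict between attribute names (structural), one unit conflict (domain), and one spelling or formatting conflict in values (data), then reports "description" and "population" mismatches.

```python
# Two hypothetical sources describing the same products (all values invented).
source_a = [
    {"sku": "A-1", "name": "Widget", "weight_kg": 1.5},
    {"sku": "A-2", "name": "Gadget", "weight_kg": 0.25},
]
source_b = [
    {"item_id": "A-1", "title": "widget ", "weight_lb": 3.3},
    {"item_id": "A-3", "title": "Gizmo", "weight_lb": 2.0},
]

# Structural (naming) conflict: different attribute names for the same concept.
B_TO_A_FIELDS = {"item_id": "sku", "title": "name", "weight_lb": "weight_kg"}

LB_TO_KG = 0.45359237  # one avoirdupois pound in kilograms


def normalize_b(record):
    """Map one source_b record onto source_a's field names, units, and spelling."""
    out = {}
    for field, value in record.items():
        if field == "weight_lb":
            # Domain (unit) conflict: convert pounds to kilograms.
            value = round(value * LB_TO_KG, 3)
        elif field == "title":
            # Data conflict: normalize stray whitespace and capitalization.
            value = value.strip().capitalize()
        out[B_TO_A_FIELDS[field]] = value
    return out


def description_mismatch(a, b):
    """Attribute names that appear in only one of the two sources."""
    fields_a = set().union(*(r.keys() for r in a))
    fields_b = set().union(*(r.keys() for r in b))
    return fields_a ^ fields_b


def population_mismatch(a, b, key="sku"):
    """Key values (set elements) that appear in only one of the two sources."""
    keys_a = {r[key] for r in a}
    keys_b = {r[key] for r in b}
    return keys_a - keys_b, keys_b - keys_a


normalized_b = [normalize_b(r) for r in source_b]
only_in_a, only_in_b = population_mismatch(source_a, normalized_b)
```

Before normalization, `description_mismatch(source_a, source_b)` reports all six field names, because the two sources share none. After normalization it is empty, while the population mismatch remains: each source holds one product the other lacks. The rounded weight for `A-1` still differs slightly from `source_a`'s value, a residual precision conflict that a mapping alone does not resolve.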
Michael Bergman expanded upon this scheme by adding language as a fourth major explicit category, and also added some examples of each kind of semantic heterogeneity, resulting in about 40 distinct potential categories. This table shows the combined 40 possible sources of semantic heterogeneities across sources:
A different approach toward classifying semantics and integration approaches is taken by Sheth et al. They split semantics into three forms: implicit, formal and powerful. Implicit semantics are largely already present or can easily be extracted; formal semantics, though relatively scarce, occur in the form of ontologies or other description logics; and powerful (soft) semantics are fuzzy and not limited to rigid set-based assignments. Sheth et al.'s main point is that first-order logic (FOL) or description logic alone is inadequate to capture the needed semantics.
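The contrast between rigid set-based assignment and fuzzy assignment can be sketched in a few lines of Python. This is only an illustration of the general idea of graded membership, not Sheth et al.'s formalism; the "warm" category, its thresholds, and both function names are assumptions made up for the example.

```python
def crisp_warm(temp_c, threshold=20.0):
    """Rigid set-based assignment: a reading is either in the set 'warm' or not."""
    return temp_c >= threshold


def fuzzy_warm(temp_c, low=15.0, high=25.0):
    """Graded assignment: degree of membership in 'warm', between 0.0 and 1.0.

    Membership rises linearly from 0.0 at `low` to 1.0 at `high`
    (both bounds are hypothetical).
    """
    if temp_c <= low:
        return 0.0
    if temp_c >= high:
        return 1.0
    return (temp_c - low) / (high - low)


# Readings just either side of the crisp threshold.
readings = [19.5, 20.5]
crisp = [crisp_warm(t) for t in readings]
graded = [fuzzy_warm(t) for t in readings]
```

The crisp rule puts 19.5 and 20.5 on opposite sides of a boundary, while the graded rule assigns them nearly the same degree of membership. That difference is one way to read the claim that set-based logic alone can miss part of the intended meaning.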

Excerpt source: the free encyclopedia Wikipedia
Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.